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We discuss recent algorithmic improvements in simulating finite temperature QCD on 
a lattice. In particular, the Rational Hybrid Monte Carlo(RHMC) algorithm is employed 
to generate lattice configurations for 2+1 flavor QCD. Unlike the Hybrid R Algorithm, 
RHMC is reversible, admitting a Metropolis accept/reject step that eliminates the 0(5t 2 ) 
errors inherent in the R Algorithm. We also employ several algorithmic speed-ups, includ- 
ing multiple time scales, the use of a more efficient numerical integrator, and Hasenbusch 



pre-conditioning of the fermion force. 
1. Finite Temperature Lattice QCD 



Lattice QCD can be used to probe the non-perturbative properties of QCD at finite 
temperature. The partition function, Z(V,T) can be written as: 



Z(V 1 T) = J VA u V^exp(-S E (V,T)) (1) 

This expression corresponds to the path integral for a four- dimensional field theory with 
temporal extent r = 1/T. 

The RBC-Bielefeld Collaboration is currently performing large-scale simulations of fi- 
nite temperature lattice QCD to calculate important physical quantities such as the tran- 
sition temperature T c and the QCD Equation of State (EoS). Here, we discuss the algo- 
rithmic improvements used in these simulations. 

2. Hybrid Molecular Dynamics (HMD) 

Hybrid Molecular dynamics (HMD) uses both a molecular dynamics evolution and 
Monte Carlo techniques to sample configuration space. Start by introducing a field of 
traceless, Hermitian SU(3) matrices, Pj. The Pi act like conjugate momenta to the gauge 
links Ui, so the pseudo-Hamiltonian is: 

H = Y,Tr{P?/2) + S QC D (2) 



Hamilton's equations are used to evolve the system along an (approximately) energy- 
conserving trajectory of fixed length (r= 0.5 or 1.0). At the end of a trajectory, Pi are 
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randomly refreshed by coupling to a heat bath, sending the next trajectory in a different 
direction. The amount of phase space sampled ~ N, compared to ~ yN for random-walk 
type algorithms. 

In our case, using staggered fermions, we can integrate out the Grassmann fields, which 
gives (rif denoting the number of degenerate quark flavors): 

Z = J [VU] det(A4) n ' /4 exp(-S s ) (3) 

For rif = 4, one employs the $ algorithm [H], which uses pseudo-fermion fields to evaluate 
the determinant of the fermion matrix. Since the $ algorithm satisfies detailed balance, 
we can use a reversible integrator that admits a Metropolis accept/reject step. 

This is not the case for rif ^ 4, where the fermion determinant results in a non-local 
term in the effective action. The R algorithmf X evaluates this non-local term using a 
noisy estimator. However, this introduces unacceptably large 0(St) discretization errors 
when numerically integrating the equations of motion. These errors can be reduced to 
0(5t 2 ), but only at the cost of using a non-reversible numerical integrator. 

3. Rational Hybrid Monte Carlo (RHMC) 

Recently, it was realized that one can use an optimal, rational approximation to evaluate 
det(M) n f /4 [^. The idea is to use a rational approximation that is valid to some arbitrary 
precision over the spectral range of the fermion operator. 

det(A4) n ' /4 = j [P0][P0]exp(0.M~ n//4 0) = J [X?0][X?0]exp(0r 2 (.M)0) (4) 

where r(x) ~ x~ nf ^ 8 . The rational approximation allows the use of a reversible integrator 
with a Metropolis accept/reject step, regardless of the number of flavors [0]. RHMC is 
exact - it lacks the discretization errors associated with the R Algorithm. The RHMC is 
also more efficient - one can use a much larger integration step size than the one typically 
used for the R Algorithm (5t « 0.4m/). As seen in Fig. 1, recent comparisons of the two 
algorithms show that step-size errors can sometimes be quite severe, especially for finite 
temperature simulations. [ EJ |3] 

4. Algorithmic Improvements 

4.1. Omelyan Integrator 

The Omelyan integrator [El U\ has much smaller 0(5t 2 ) errors compared to the standard 
leapfrog integrator that is normally used. As a result, one can use much larger integration 
step sizes during a molecular dynamics evolution, leading to a 50% speedup without 
affecting the acceptance rate. 

4.2. Multiple time steps 

The gauge field, light quarks, and strange quark contribute different amounts to the 
force during a molecular dynamics trajectory. The largest contributions to the force, the 
gauge force, must be calculated most often, but is actually the least expensive computa- 
tionally. By using multiple time scales for the gauge force and the fermion force during 
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Figure 1. Finite step size errors in the 
chiral condensate using the R Algorithm. 
3f QCD, 8 3 x 4 volume, p4fat7 fermion 
action, m q a = 0.1 



Figure 2. Magnitudes of gauge, light 
quark, and strange quark forces without 
quotient pre-conditioning (left) and with 
quotient pre-conditioning(right) 



integration, we can increase speed while maintaining constant acceptance by tuning the 
impulses from each force update to approximately the same value. [Ej- 

4.3. Quotient Pre-conditioning 

The force coming from the light quark and the strange quark are actually quite simi- 
lar, differing because of small contributions from the lightest modes. Therefore, we can 
"precondition" the light quark kernel Aii by dividing by the strange quark kernel -M S [E]. 



As shown in Fig. 2, quotient preconditioning drastically reduces the force from the light 
quark, allowing us to calculate the light quark force less often. Since CG count is domi- 
nated by the light quark inversion, this allows for potentially large performance increases. 

5. Comparisons of different techniques 

Table ^ is a comparison of the different algorithmic techniques. At the lightest quark 
mass (m q = 0.lm s ), these improvements give a 20x reduction in CG count compared to 
the R Algorithm. Even for heavier masses (m q = 0Am s ) we still get a 7x or 8x speedup. 

6. Summary and Future plans 

The use of the RHMC, along with various other algorithmic improvements, have greatly 
accelerated finite temperature lattice simulations, in some cases gaining as much as a 
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Table 1 

Comparison for different algorithms on finite temperature lattices 



2+lf p4fat3, m s a = .065, (3 = 3.30, 16 3 x 4, 8 3 x 4 
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*Denotes use of quotient preconditioning 

f A configuration is "accepted" with probability = min(l, exp(— SH)) 



factor of 20 in speed compared to old techniques. This vastly expands the parameter 
space accessible to finite temperature lattice simulations with staggered-type fermions. In 
the future we hope to test other possible improvements, such as using a 0(5t A ) integrator 
for finite temperature lattices. 
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